National Repository of Grey Literature: 16 records found, showing 1 - 10
Extracting text data from the webpages
Troják, David ; Morský, Ondřej (referee) ; Červenec, Radek (advisor)
This work deals with text mining from web pages and gives an overview of available programs and their methods of text extraction. Part of the work is a program written in Java that extracts text data from specific web pages and saves it to an XML file.
Framework for Information Extraction from WWW
Brychta, Filip ; Bartík, Vladimír (referee) ; Burget, Radek (advisor)
The web has developed into the largest source of electronic documents, so it would be very useful to process this information automatically. This is, however, not a trivial problem. Most documents are written in HTML (Hypertext Markup Language), which does not support semantic description of the content. The goal of this work is to create a modular system for extracting information from HTML documents and processing it further. Further processing means storing the information in an XML document or a relational database. The system's modularity makes it possible to use various methods of information extraction and storage, so the system can be used for various tasks.
Data Extraction from Product Descriptions
Sláma, Vojtěch ; Očenášek, Pavel (referee) ; Burget, Radek (advisor)
This work concentrates on the design and implementation of automated support for data extraction from product descriptions. The system will be used for e-shop purposes. The work introduces current approaches to information extraction from HTML documents. It focuses chiefly on wrappers and methods for their induction. The visual approach to information extraction is also mentioned. System requirements and basic principles are described in the design part of the work. Next, a path tracing algorithm in the document object model is described in detail. The last section of the work evaluates the results of experiments made with the implemented system.
Tools and Techniques for Creating Mobile Applications
Čtvrtníček, Dušan ; Žák, Jakub (referee) ; Samek, Jan (advisor)
This bachelor thesis describes and compares the possibilities for developing mobile applications for various mobile operating systems (Android, Windows Phone, etc.). The thesis also describes the tools available for creating mobile applications. Two development approaches, native and hybrid, are compared and evaluated on the basis of the created application.
Extraction of Semantic Relations from Text
Pospíšil, Milan ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor)
Many semi-structured documents exist today that we want to convert to a structured form. The goal of this work is to create a system that makes this task more automated. This can be a difficult problem, because most of these documents are not generated by a computer, so the system has to tolerate differences. Some semantic understanding is also needed, which is why we chose only the domain of meeting minutes documents.
Extracting text data from the webpages
Mazal, Zdeněk ; Morský, Ondřej (referee) ; Fojtová, Lucie (advisor)
This work focuses on data mining, and especially text mining, from web pages. It gives an overview of programs for downloading text and of the ways of extracting it. It also contains an overview of the most frequently used programs for extracting data from the internet. The output of this thesis is a Java program that can download text from a selection of servers and save it to an XML file.
Extensible Provider for Windows Powershell
Závišek, Josef ; Ježek, Pavel (advisor) ; Obdržálek, David (referee)
This thesis deals with the design and implementation of an extensible provider for Windows PowerShell. The provider allows the registration of adapters that provide access to various data stores. The thesis gives an introduction to PowerShell and outlines how to implement new extensions. It then describes the architecture of the provider in detail. The next part is devoted to the design and implementation of an adapter for compressed files. For this purpose, the SevenZip library is used, which had to be adapted for use from the C# language. The thesis therefore also includes a description of the wrapper that allows the library to be used from managed code.
